Introduction¶
Background¶
FOMC:¶
The Federal Open Market Committee (FOMC) meets eight times a year. During these meetings, members of the Federal Reserve discuss the current state of the economy and set the target for the federal funds rate. The federal funds rate has significant implications for the economy, as it influences the cost of borrowing money, inflation, unemployment, and more. Because of its importance, unexpected decisions at FOMC meetings can lead to substantial market swings.
Beige Book:¶
Officially known as the “Summary of Commentary on Current Economic Conditions,” the Beige Book allows each of the twelve Federal Reserve Districts to provide a regional perspective on economic conditions. The Beige Book is released two weeks prior to each FOMC meeting and is used by the committee as one of several inputs to inform its monetary policy decisions.
Project Statement: Analyzing the Impact of Federal Reserve Announcement on Major Economic Indicators¶
In the financial markets, Federal Reserve announcements are highly anticipated events. They have the power to influence investors and to trigger shifts in major economic indicators such as the S&P 500, the Dow Jones Industrial Average, and the NASDAQ Composite. Historically, however, this impact has been hard to pin down, and understanding its specific nuances matters, especially when monitoring the real-time effect of an announcement. Even with the text of these announcements and financial indicators readily available, drawing out these correlations remains a complex and challenging task. The goal of our project is to bridge this gap by creating a comprehensive analytical tool that leverages natural language processing and predictive modeling to quantify the effects of Federal Reserve communications on the financial markets.
At the core of this analysis is the idea that the specific language used in Federal Reserve announcements, including the Beige Book, FOMC meeting notes, and other policy documents, conveys critical information that may shape market perceptions and reactions. By applying text analysis methods, we aim to label market increases and decreases, extracting sentiment scores and thematic elements from these announcements. These insights are then integrated with financial market data to develop predictive models capable of forecasting market movements in response to the announcements.
This project is designed to serve a broad audience of investors, policymakers, and financial analysts. We seek to provide a tool that offers deeper insight into the economic impact of Federal Reserve decisions by transforming complex financial and linguistic data into actionable predictions. Our goal is to empower stakeholders to make informed decisions in a volatile market environment shaped by the sentiment of these announcements.
In summary, our project addresses the need for a more nuanced understanding of the relationship between Federal Reserve announcements and market behavior. By offering a data-driven approach to predicting and interpreting the economic impact of these announcements, we provide a valuable tool for analysts observing the related market activity.
Context¶
Our project builds upon an extensive body of existing research that has explored the relationship between Federal Reserve announcements and financial market reactions. Several key studies informed our approach, providing both theoretical foundations and practical insights.
The first is Niemira and Klein’s (1994) work on the Beige Book as a forecasting tool, which underscored the significance of Federal Reserve communications in influencing market behavior. Their research highlighted how qualitative information, such as that found in the Beige Book, can provide early signals of economic trends that are later reflected in quantitative data. This study motivated our use of natural language processing to extract sentiment and key insights from Federal Reserve documents.
The second is Bernanke and Kuttner’s (2005) seminal paper on the stock market’s response to Federal Reserve policy actions. This study was crucial to our understanding of the mechanisms through which monetary policy can impact financial markets. Their findings on the sensitivity of stock prices to unexpected changes in the federal funds rate helped shape our predictive modeling efforts, particularly our assessment of the immediate market reaction to FOMC announcements.
Finally, the blog post by D’Amico and King (2021) on the Federal Reserve’s communication strategies offered a contemporary perspective on how the language used by the Fed can shape market expectations and behaviors. Their analysis provided insight into the Fed’s evolving communication style, especially during times of economic uncertainty, and aligned with our focus on text and sentiment analysis to capture market sentiment in response to different types of announcements.
These references, along with others integrated into our research, provide a strong foundation and necessary context for our project. They establish the relevance of analyzing Federal Reserve announcements, validate the use of natural language processing for sentiment analysis, and highlight the importance of predictive modeling in deepening our understanding of their impact on the financial markets.
Methodology¶
The policy direction released at each FOMC meeting is a major source of speculation in the market. Accurately predicting the Fed's decision, or predicting it better than the market's expectation, has the potential for significant profits. Institutions and retail investors alike build predictive models on the vast number of potential explanatory variables. This creates a system of intense competition, which led us to explore niche relationships using novel techniques. To increase the likelihood of discovering an interesting relationship, we explored two different hypotheses:
- Hypothesis 1: The Fed's Beige Books contain information correlated with the released Fed policy that has not yet been incorporated into market expectations.
- Hypothesis 2: The FOMC statement itself can be used to predict the direction of the market.
Hypothesis 1:¶
All regression tasks are mathematical processes, making quantitative data the easiest and most intuitive choice for developing a predictive regression model. The relative difficulty of building a regression model from qualitative information means the competitive space is smaller. Our hypothesis is therefore that the Beige Books may contain information not fully incorporated into market expectations, and that recent advancements in natural language processing (NLP) may allow us to uncover previously undiscovered relationships.
The above graph depicts the daily closing price of the S&P 500 since January 1, 1970, with optional tick marks showing the release dates of the Beige Books and of the FOMC statements. The Beige Book ticks are beige; hovering over one shows the key phrases associated with that Beige Book for each district. The FOMC ticks are green if the five-day average closing price after the FOMC date is higher than the five-day average closing price before it, and red otherwise; hovering over a green or red tick shows the percentage change in the market between the five days prior and the five days after the FOMC.
Method¶
Natural language processing (NLP) has been around for decades, but a recent advancement, the transformer (Vaswani et al., 2017), has greatly advanced the field in a short period. To predict the FOMC's impact on the market, we leveraged BERT (Devlin et al., 2018), a transformer-based model. BERT, however, is not a regression model, so we developed a novel architecture that takes the output of BERT and produces a single float value: the model's prediction of the FOMC's impact on the market.
The BERT backbone of our model was pre-trained on financial data. The output of the BERT portion is a (13, 3) tensor, where the thirteen corresponds to the twelve Fed districts plus the national summary and the three corresponds to the sentiment scores (positive, neutral, negative). We then ran this (13, 3) tensor through two convolutional blocks: the first integrates sentiment values across all districts for each sentiment score, while the second integrates values across all sentiment scores for each district. The outputs of these convolutional blocks, (1, 3) and (13, 1), were then matrix-multiplied to produce a new (13, 3) tensor. This convolution-integrated output was concatenated with the output of another convolutional block to create a (13, 3) feature map. This intermediate tensor was then run through an FNN and a convolutional block, and the outputs were added to create a single float value representing the model's prediction of the FOMC's impact on the market given the Beige Book.
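In outline, the tensor flow of this head can be sketched with plain NumPy. The weights below are random stand-ins for the learned convolution parameters, and the final summation is a simplification of the actual FNN and convolutional output heads, so this is an illustration of the shapes involved rather than the exact architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# (13, 3) BERT output: 12 districts + national summary x (pos, neu, neg)
sentiments = rng.random((13, 3))

# Stand-ins for the learned convolution weights (random here, trained in practice)
w_sent = rng.standard_normal((3, 1))    # collapses sentiments -> one value per district
w_dist = rng.standard_normal((13, 1))   # collapses districts  -> one value per sentiment

rows = sentiments @ w_sent              # (13, 1): district summary across sentiments
cols = (sentiments.T @ w_dist).T        # (1, 3): sentiment summary across districts
mixed = rows @ cols                     # matrix product -> (13, 3) feature map

# Stand-in for the final FNN and convolutional heads whose outputs are summed
w_fnn = rng.standard_normal(13 * 3)
w_conv = rng.standard_normal(13 * 3)
prediction = float(mixed.ravel() @ w_fnn + mixed.ravel() @ w_conv)
print(mixed.shape, type(prediction))
```

The (13, 1) @ (1, 3) product is an outer product, which is what later makes the intermediate feature map interpretable per district and per sentiment.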
Evaluation Strategy¶
The most recent 10% of the data was used for testing, the next most recent 10% for validation, and the remainder for training. The model was evaluated using a variety of metrics, including R-squared (R2), mean squared error (MSE), and root mean squared error (RMSE). Furthermore, the actual and predicted surprise were binned into market increase or decrease, and a confusion matrix was used to assess model performance. Success was defined as anything better than random guessing.
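A minimal sketch of this evaluation, using synthetic actual and predicted surprise values in place of the model's outputs:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, mean_squared_error, r2_score

# Synthetic actual vs. predicted "surprise" values for illustration
rng = np.random.default_rng(0)
actual = rng.normal(0, 0.02, 100)
predicted = actual + rng.normal(0, 0.02, 100)  # noisy stand-in predictions

r2 = r2_score(actual, predicted)
mse = mean_squared_error(actual, predicted)
rmse = np.sqrt(mse)

# Bin into market increase (True) vs. decrease (False) for the confusion matrix
cm = confusion_matrix(actual > 0, predicted > 0)
print(f"R2={r2:.3f}  MSE={mse:.5f}  RMSE={rmse:.5f}")
print(cm)
```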
Results¶
| Dataset | R2 | MSE | RMSE |
|---|---|---|---|
| train | 0.6952 | 0.0002 | 0.0128 |
| val | -0.5282 | 0.0007 | 0.0273 |
| test | -1.3099 | 0.0012 | 0.0344 |
Interpretation of the table above is as follows: the model did a decent job of learning during training. The training R2 score of 0.6952 indicates that the model was able to explain roughly 70% of the variance in the data, which is excellent given the difficulty of the task. However, the model was not able to generalize that knowledge, indicating that it was memorizing the training set, i.e., overfitting. It is very likely that all the information in the Beige Book that correlates with a change in market prices has already been priced in. In that scenario, we would expect the model's outputs to resemble random chance.
The confusion matrix (above) appears to confirm this theory: when we bin the actual and predicted market responses and visualize the confusion matrix, we see that the model's outputs are no better than random guessing.
Discussion¶
Deep learning methods have been called black boxes because of their uninterpretable nature. However, this model design allows for some interpretation because of the convolutional blocks. Specifically, the (13, 1) convolutional block has a one-to-one relationship with the thirteen districts, and the (1, 3) convolutional block has a one-to-one relationship with the sentiment scores. Therefore, we can matrix-multiply those convolutional blocks together to get a (13, 3) array that gives an intermediary view of the model's decision-making process.
Because this heatmap comes from an intermediary step of the deep learning framework, we cannot interpret these parameter weights as a one-to-one relationship with districts and sentiment scores. However, we can look at what the model is weighing most heavily at this step. It appears to be keying in on the Beige Book portions from Cleveland, Kansas City, the National Summary, and San Francisco.
Hypothesis 2:¶
Following an FOMC release, the market typically experiences both an immediate reaction and a delayed reaction as it continues to digest the information and policy presented. Our goal is to predict the change in market direction after the FOMC releases its statement using the statement itself. While most analyses focus on predicting the immediate market impact, the longer-term impact is often neglected; our aim is to forecast this longer-term change in market direction.
We define the change in market direction by first taking the percent change in price over the 20 trading days before the statement as the baseline direction. For each subsequent day, we calculate the percent change from the day of the statement. The difference between the change after the statement and the baseline direction before it is the change in market direction.
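This definition can be sketched in pandas, with synthetic prices standing in for a real ticker:

```python
import numpy as np
import pandas as pd

# Synthetic daily closing prices; in practice these would be real ticker data.
rng = np.random.default_rng(0)
prices = pd.Series(100 * np.cumprod(1 + rng.normal(0, 0.01, 60)))

stmt = 30  # positional index of the FOMC statement day
# Baseline direction: percent change over the 20 trading days before the statement
baseline = (prices.iloc[stmt] - prices.iloc[stmt - 20]) / prices.iloc[stmt - 20]
# Post-statement direction: percent change from the statement day to each later day
post = (prices.iloc[stmt + 1:] - prices.iloc[stmt]) / prices.iloc[stmt]
# Change in market direction: post-statement change minus the pre-statement baseline
direction_change = post - baseline
print(direction_change.head())
```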
Method¶
To test this hypothesis, we developed two natural language processing (NLP) models that take FOMC statements as input and predict a single value per ticker per day after the release of the statement. The architecture for both models follows these steps:
- Sentence Selection: Extract relevant sentences from the statement using a custom SentenceSelector.
- Embedding Creation: Generate an embedding for the statement.
- Dimensionality Reduction: Reduce the dimensions of the embedding.
- Regression: Regress the change in market direction on the principal components.
In our experience, much of the text in FOMC statements is not directly relevant and can detract from model performance. To address this, we designed the SentenceSelector, which breaks the statement into paragraphs and identifies the most relevant ones using hand-picked examples. This selector employs a transformer model to create embeddings and a K-nearest neighbors regressor to extract the relevant sentences.
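A minimal sketch of the selector's logic: the real selector embeds text with philschmid/bge-base-financial-matryoshka, while a TF-IDF vectorizer stands in here (along with hypothetical exemplar paragraphs) so the sketch runs without the transformer model:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsRegressor

# Hand-picked exemplar paragraphs labeled by relevance (1 = keep, 0 = drop).
examples = [
    "The Committee decided to maintain the target range at 5-1/4 percent.",
    "Voting for the monetary policy action were the following members.",
]
labels = [1.0, 0.0]

# TF-IDF stands in for the transformer encoder used in the actual selector.
vectorizer = TfidfVectorizer().fit(examples)
knn = KNeighborsRegressor(n_neighbors=1).fit(vectorizer.transform(examples), labels)

def select_paragraphs(statement, threshold=0.5):
    """Split by line, drop short sentences, keep paragraphs the KNN scores as relevant."""
    paragraphs = [p for p in statement.split("\n") if len(p.split()) >= 15]
    if not paragraphs:
        return []
    scores = knn.predict(vectorizer.transform(paragraphs))
    return [p for p, s in zip(paragraphs, scores) if s >= threshold]

kept = select_paragraphs(
    "The Committee decided to lower the target range for the federal funds rate "
    "to 5 percent in light of recent economic developments.\n"
    "Voting for the monetary policy action were all members of the Committee "
    "present at the meeting in Washington today."
)
print(kept)
```

With this toy setup, the policy-decision paragraph is kept and the voting paragraph is dropped, mirroring the exclusion condition shown in the pipeline diagrams below.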
Additionally, because this task is complex and involves many potential confounding variables, we prioritized model architectures that allow for interpretability of their results. Our goal is to understand the rationale behind the model’s decisions, which we believe could reveal novel or interesting relationships between the FOMC statement and market changes. To achieve this, we use SHAP values—a measure of how important each word or phrase is to the prediction. These values will help us identify key patterns that underlie the FOMC statements.
The following two diagrams depict the complete architectures for the TF-IDF model and the Transformer model, respectively.
Pipeline(steps=[('sentenceselector',
SentenceSelector(condition=Exclude voting, email, notes and sentences with less than 15 words,
encoder=philschmid/bge-base-financial-matryoshka,
estimator=KNeighborsRegressor(),
examples=Contains ('the committee decided' & '{x}percent') | 'Federal Reserve Actions',
splitter=Split by line)),
('tfidfvectorizer', TfidfVectorizer(max_df=0.5, min_df=0.05)),
('nmf', NMF(max_iter=1000, n_components=6, random_state=0)),
('linearregression', LinearRegression())])
Pipeline(steps=[('sentenceselector',
SentenceSelector(condition=Exclude voting, email, notes and sentences with less than 15 words,
encoder=philschmid/bge-base-financial-matryoshka,
estimator=KNeighborsRegressor(),
examples=Contains ('the committee decided' & '{x}percent') | 'Federal Reserve Actions',
splitter=Split by line)),
('encodertransformer',
EncoderTransformer(model_name='philschmid/bge-base-financial-matryoshka')),
('pca', PCA(n_components=24)),
('linearregression', LinearRegression())])
Evaluation Strategy¶
We will use balanced accuracy as the key metric to evaluate our models. In addition, we will use the Pearson correlation to determine whether our models' predictions are correlated with the true change in market direction. Furthermore, we will conduct a temporal analysis to determine the optimal prediction range of our models.
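These two metrics can be computed as follows, with synthetic predictions standing in for model output:

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score

rng = np.random.default_rng(0)
true_change = rng.normal(0, 0.01, 50)                # true change in market direction
pred_change = true_change + rng.normal(0, 0.01, 50)  # stand-in model predictions

# Balanced accuracy on the sign of the change (market up vs. down)
bal_acc = balanced_accuracy_score(true_change > 0, pred_change > 0)

# Pearson correlation between predicted and true changes
pearson = np.corrcoef(true_change, pred_change)[0, 1]
print(f"balanced accuracy={bal_acc:.3f}  Pearson r={pearson:.3f}")
```

Balanced accuracy averages per-class recall, so it is robust to the class imbalance between up and down days.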
Results¶
The models were evaluated on a hold-out set. We calculated predictions for each day and derived relevant statistics for each ticker. These statistics were then averaged across all days to evaluate overall performance for each ticker. In this case, accuracy is defined as the balanced accuracy score, which accounts for class imbalance in the binary prediction of market change (positive or negative).
TF-IDF model results:¶
| Ticker | Accuracy | R2 | Pearson |
|---|---|---|---|
| S&P 500 | 0.616167 | 0.196419 | 0.479717 |
| Russell 2000 | 0.565837 | 0.129949 | 0.370180 |
| NASDAQ Composite | 0.599512 | 0.098927 | 0.360678 |
| Volatility Index | 0.444170 | 0.146647 | 0.444906 |
| 13 Week Treasury Bill | 0.491211 | -21.937889 | -0.132585 |
| Treasury Yield 30 Years | 0.540284 | 0.001057 | 0.293874 |
Transformer model results:¶
| Ticker | Accuracy | R2 | Pearson |
|---|---|---|---|
| S&P 500 | 0.588303 | 0.100980 | 0.357646 |
| Russell 2000 | 0.534703 | 0.050358 | 0.234498 |
| NASDAQ Composite | 0.527710 | 0.054103 | 0.272912 |
| Volatility Index | 0.482865 | 0.039945 | 0.212677 |
| 13 Week Treasury Bill | 0.484147 | -3.500579 | -0.130257 |
| Treasury Yield 30 Years | 0.518842 | -0.062128 | 0.132679 |
The models showed decent performance on the test set in terms of accuracy, with their predictions showing a good correlation with the true change in market direction. However, the models performed poorly on the 13 Week Treasury Bill and the Volatility Index, likely due to those instruments' sensitivity to short-term market shocks. Conversely, the models performed best on the S&P 500, which is unsurprising given that it is the most representative index of the overall market.
Despite decent accuracy, the model’s performance is hindered by the multitude of external factors influencing market direction. However, the observed correlations for most tickers suggest that the models capture some genuine effects of the FOMC statements, even though other confounding factors are at play.
A notable pattern emerges around the 20-day mark, where we observe a significant increase in the correlation between our predictions and actual market direction for stock-related tickers. This suggests that the model might be detecting effects related more to the actual implementation of policies rather than the immediate reaction to the statement text. This delayed correlation could indicate that the market takes time to fully price in the implications of the policy changes discussed in the FOMC statements.
Discussion¶
To decipher the transformer model's "black box," we used SHAP values and observed some clear trends. By randomly leaving out words over many iterations, SHAP allows us to see the effect of each word on the model's output. To use this plot, select a ticker; the plot then shows the contribution of each word or phrase to the prediction, with positive impacts in red and negative impacts in blue. Some of the major patterns include:
- Consistency leads to a decrease in the Volatility Index, while actions or movement cause it to increase.
- The 30-Year Treasuries favor consistency and long-term goals.
- Stocks increase with mentions of maximum employment goals and interest rates but decrease with other actions of the Federal Reserve.
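The leave-words-out idea behind the SHAP plot can be approximated with a simple leave-one-out sketch. Here `toy_model` is a hypothetical stand-in for the trained pipeline, and true SHAP values average over many random word subsets rather than single removals:

```python
def toy_model(text):
    """Hypothetical stand-in for the trained pipeline: scores a statement."""
    weights = {"maximum": 0.5, "employment": 0.4, "volatility": -0.3}
    return sum(weights.get(w.lower(), 0.0) for w in text.split())

def leave_one_out_attributions(text, model):
    """Approximate each word's contribution as the score drop when it is removed.
    SHAP proper averages over many random word subsets; this is the one-word case."""
    words = text.split()
    base = model(text)
    return {
        w: base - model(" ".join(words[:i] + words[i + 1:]))
        for i, w in enumerate(words)
    }

attr = leave_one_out_attributions("pursuing maximum employment goals", toy_model)
print(attr)
```

In this toy example, "maximum" and "employment" receive positive attributions while the filler words receive none, which is the same kind of word-level signal the SHAP plot visualizes in red and blue.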
Click on a ticker to see the effects of the statement on the prediction (depending on your screen, you may need to zoom out).
Final Remarks¶
Taken together, the results of our work seem to indicate that the market has effectively priced in the statements released by the Fed. However, we believe future research is necessary to confirm this finding.
Broader Impacts¶
Our project aims to predict the impact of Federal Reserve announcements on major economic indicators, and it holds significant implications for a wide range of stakeholders, including investors, policymakers, financial analysts, and economists. By utilizing this tool, these stakeholders can anticipate market reactions to Federal Reserve communications. We seek to empower these groups to make informed decisions that lead to profitable investments and to more stable and predictable financial markets. This predictive capability also helps with risk management: investors can better hedge against potential market downturns or profit from expected upswings.
Our analysis also offers value to policymakers by providing feedback on how Federal Reserve communications are interpreted by the market, which could lead to more effective and transparent monetary policy. As policymakers come to understand the nuances of how the market will react, that understanding can shape the specific language used in these announcements, guide future communication strategies, and enable a clearer and more predictable standard of guidance for these economic indicators.
However, this project also raises important ethical considerations. One concern is the potential misuse of the predictive models we develop. If this tool were used by only a select group of individuals or institutions, it could exacerbate existing inequalities in the financial markets, particularly if those individuals or institutions have the financial resources to move the markets. The ability to predict market movements based on Federal Reserve announcements could give such actors an unfair advantage, allowing them to exploit these insights at the expense of their less informed counterparts.
Additionally, there is an ethical question of transparency. The Federal Reserve seeks to manage the economy in a way that benefits the broader public, and its communications are intended to be openly interpreted. If predictive models like ours become widely used, they may create pressure on the Federal Reserve to alter the way it communicates, leading to less transparency and more strategic messaging. In conclusion, our project offers substantial benefits, including enhancing decision-making for investors and policymakers, but it also requires careful consideration of its broader impact to ensure the tool is used ethically and does not contribute to greater market inequality.
Statement of Work¶
- Chaim Nechamkin: Hypothesis 2, Initial Project Idea
- Darryl Joyner: Background Research/Analyst, Video Editing
- Joshua Fisher: Hypothesis 1, Code Management
Incorporating Feedback¶
Concern about minimizing the "black-box" problem was raised during our project formulation and in our standup meetings. In response, we emphasized the interpretability of our results.
References Cited¶
Bernanke, B. S., & Kuttner, K. N. (2005). What Explains the Stock Market's Reaction to Federal Reserve Policy? Journal of Finance, 60(3), 1221-1257.
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. ArXiv.org. https://arxiv.org/abs/1810.04805
D’Amico, S., & King, T. B. (2021). The Language of Federal Reserve Communications. Chicago Fed Letter.
Niemira, M. P., & Klein, P. A. (1994). Forecasting Financial and Economic Cycles. Wiley.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017, June 12). Attention Is All You Need. ArXiv.org. https://arxiv.org/abs/1706.03762